Spectral Active Clustering via Purification of the k-Nearest Neighbor Graph
Abstract
Spectral clustering is widely used in data mining, machine learning and pattern recognition. Recent work has added pairwise constraints as side information to impose top-down structure on the clustering results. However, most of these algorithms are “passive” in the sense that the side information is provided beforehand. In this paper, we present a spectral active clustering method that actively selects pairwise constraints based on a novel notion of node uncertainty rather than pair uncertainty. In our approach, the constraints drive a purification process on the k-nearest neighbor graph, in which edges are removed from the graph based on the constraints, and this ultimately leads to an improved, constraint-satisfying clustering. We have evaluated our framework on three datasets (UCI, gene and image sets) against baseline and state-of-the-art methods and find the proposed algorithm to be more effective.
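The purification idea can be illustrated with a small sketch. This is not the authors' algorithm: the abstract does not state the edge-removal rule or how constraints are selected, so the sketch assumes the simplest possible rule (delete any graph edge that directly joins a cannot-link pair) and takes the constraint by hand instead of selecting it actively. The function names `knn_graph`, `purify` and `n_components` are hypothetical. The sketch relies on one standard fact: the number of zero eigenvalues of the graph Laplacian L = D - A equals the number of connected components.

```python
import numpy as np

def knn_graph(X, k):
    """Symmetric 0/1 adjacency matrix of the k-nearest neighbor graph."""
    n = X.shape[0]
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)            # a point is not its own neighbor
    nbrs = np.argsort(d, axis=1)[:, :k]    # indices of the k closest points
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), k), nbrs.ravel()] = 1.0
    return np.maximum(A, A.T)              # keep an edge if either end chose it

def purify(A, cannot_link):
    """Assumed rule: drop every edge that directly joins a cannot-link pair."""
    P = A.copy()
    for i, j in cannot_link:
        P[i, j] = P[j, i] = 0.0
    return P

def n_components(A, tol=1e-8):
    """Count connected components as the zero eigenvalues of L = D - A."""
    L = np.diag(A.sum(axis=1)) - A
    return int(np.sum(np.linalg.eigvalsh(L) < tol))

# Two groups of 1-D points; the 2-NN graph links them through points 2 and 3.
X = np.array([[0.0], [1.0], [2.0], [3.5], [4.5], [5.5]])
A = knn_graph(X, k=2)
P = purify(A, cannot_link=[(2, 3)])        # hand-picked constraint (assumption)
print(n_components(A), n_components(P))
```

In this toy example the single cannot-link constraint removes the only edge between the two groups, so the purified graph separates into two components. In a full pipeline the purified adjacency matrix would then be passed to a spectral clustering step.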
Similar resources
FUZZY K-NEAREST NEIGHBOR METHOD TO CLASSIFY DATA IN A CLOSED AREA
Clustering of objects is an important area of research and application in a variety of fields. In this paper we present a technique for data clustering and apply it to data clustering in a closed area. We compare this method with K-nearest neighbor and K-means.
Partitioning Well-Clustered Graphs: Spectral Clustering Works!
In this work we study the widely used spectral clustering algorithms, i.e. partition a graph into k clusters via (1) embedding the vertices of a graph into a low-dimensional space using the bottom eigenvectors of the Laplacian matrix, and (2) partitioning the embedded points via k-means algorithms. We show that, for a wide class of graphs, spectral clustering algorithms give a good approximatio...
Spectral Clustering Based on k-Nearest Neighbor Graph
Finding clusters in data is a challenging task when the clusters differ widely in shapes, sizes, and densities. We present a novel spectral algorithm, Speclus, with a similarity measure based on a modified mutual nearest neighbor graph. The resulting affinity matrix reflects the true structure of the data. Its eigenvectors, which do not change their sign, are used for clustering data. The algorithm requir...
Multilingual Spectral Clustering Using Document Similarity Propagation
We present a novel approach for multilingual document clustering using only comparable corpora to achieve cross-lingual semantic interoperability. The method models document collections as a weighted graph, and supervisory information is given as sets of must-link constraints for documents in different languages. Recursive k-nearest neighbor similarity propagation is used to exploit the prior k...
Fast PNN-based Clustering Using K-nearest Neighbor Graph
The search for the nearest neighbor is the main source of computation in most clustering algorithms. We propose the use of a nearest neighbor graph for reducing the number of candidates. The number of distance calculations per search can be reduced from O(N) to O(k), where N is the number of clusters and k is the number of neighbors in the graph. We apply the proposed scheme within agglomerative clusteri...
Publication date: 2012